Overview

Dataset statistics

Number of variables: 8
Number of observations: 887
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 166.5 KiB
Average record size in memory: 192.2 B
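
The average record size follows from the total size in memory and the number of observations. A minimal sketch of that arithmetic, assuming 1 KiB = 1024 bytes (the variable names are illustrative, not taken from the report):

```python
# Figures copied from the dataset statistics above.
total_size_kib = 166.5   # total size in memory, in KiB
n_observations = 887     # number of observations (rows)

# Assumption: 1 KiB = 1024 bytes. Dividing by the row count gives bytes per record.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"{avg_record_bytes:.1f} B")
```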

Variable types

Categorical: 3
Text: 1
Numeric: 4

Alerts

Sex is highly overall correlated with Survived (High correlation)
Survived is highly overall correlated with Sex (High correlation)
Name has unique values (Unique)
Siblings/Spouses Aboard has 604 (68.1%) zeros (Zeros)
Parents/Children Aboard has 674 (76.0%) zeros (Zeros)
Fare has 15 (1.7%) zeros (Zeros)
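
The percentages in the zeros alerts are the zero counts divided by the 887 observations. A small sketch that recomputes them from the counts quoted above (the dictionary layout is only an illustration):

```python
n_observations = 887

# Zero counts as quoted in the alerts.
zero_counts = {
    "Siblings/Spouses Aboard": 604,
    "Parents/Children Aboard": 674,
    "Fare": 15,
}

# Share of zeros per variable, as a percentage rounded to one decimal place.
zero_share = {
    name: round(100 * count / n_observations, 1)
    for name, count in zero_counts.items()
}

for name, share in zero_share.items():
    print(f"{name}: {share}% zeros")
```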

Reproduction

Analysis started: 2024-09-14 03:14:33.479977
Analysis finished: 2024-09-14 03:14:39.242038
Duration: 5.76 seconds
Software version: ydata-profiling v4.10.0
Download configuration: config.json
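
The reported duration is the difference between the two timestamps. A short sketch using only the Python standard library, with the timestamps copied from the lines above:

```python
from datetime import datetime

# Timestamps as printed in the Reproduction block.
started = datetime.fromisoformat("2024-09-14 03:14:33.479977")
finished = datetime.fromisoformat("2024-09-14 03:14:39.242038")

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()

print(f"{duration_s:.2f} seconds")
```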

Variables

Survived
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 50.4 KiB

Value  Count
0      545
1      342

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 887
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      545    61.4%
1      342    38.6%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Most occurring characters

Value  Count  Frequency (%)
0      545    61.4%
1      342    38.6%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  887    100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
0      545    61.4%
1      342    38.6%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  887    100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
0      545    61.4%
1      342    38.6%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  887    100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
0      545    61.4%
1      342    38.6%

Pclass
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 50.4 KiB

Value  Count
3      487
1      216
2      184

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 887
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 1
3rd row: 3
4th row: 1
5th row: 3

Common Values

Value  Count  Frequency (%)
3      487    54.9%
1      216    24.4%
2      184    20.7%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Most occurring characters

Value  Count  Frequency (%)
3      487    54.9%
1      216    24.4%
2      184    20.7%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  887    100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
3      487    54.9%
1      216    24.4%
2      184    20.7%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  887    100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
3      487    54.9%
1      216    24.4%
2      184    20.7%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  887    100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
3      487    54.9%
1      216    24.4%
2      184    20.7%

Name
Text

UNIQUE 

Distinct: 887
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 71.4 KiB

Length

Max length: 81
Median length: 50
Mean length: 25.322435
Min length: 11

Characters and Unicode

Total characters: 22461
Distinct characters: 58
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 887
Unique (%): 100.0%

Sample

1st row: Mr. Owen Harris Braund
2nd row: Mrs. John Bradley (Florence Briggs Thayer) Cumings
3rd row: Miss. Laina Heikkinen
4th row: Mrs. Jacques Heath (Lily May Peel) Futrelle
5th row: Mr. William Henry Allen

Words

Value                Count  Frequency (%)
mr                   513    14.5%
miss                 182    5.1%
mrs                  125    3.5%
william              62     1.8%
john                 44     1.2%
master               40     1.1%
henry                33     0.9%
james                24     0.7%
charles              23     0.7%
george               22     0.6%
Other values (1482)  2468   69.8%
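
The word frequencies for Name look like counts of lower-cased tokens across all names. The sketch below shows one way such counts could be produced from the five sample rows. The tokenization rule (lower-case, keep runs of ASCII letters) is an assumption made for illustration and may differ from what the profiler actually does.

```python
import re
from collections import Counter

# The five sample values of Name shown in the report.
sample_names = [
    "Mr. Owen Harris Braund",
    "Mrs. John Bradley (Florence Briggs Thayer) Cumings",
    "Miss. Laina Heikkinen",
    "Mrs. Jacques Heath (Lily May Peel) Futrelle",
    "Mr. William Henry Allen",
]

def tokenize(text):
    """Assumed rule: lower-case the text and keep runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())

# Count every token across the sample names.
word_counts = Counter(word for name in sample_names for word in tokenize(name))
total_words = sum(word_counts.values())

# Print each word with its share of all tokens, most frequent first.
for word, count in word_counts.most_common(3):
    print(f"{word}: {count} ({100 * count / total_words:.1f}%)")
```

On the full column the same division explains the table: 513 occurrences of "mr" out of 3536 counted words is 14.5%.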

Most occurring characters

Value              Count  Frequency (%)
(space)            2649   11.8%
r                  1914   8.5%
e                  1654   7.4%
a                  1618   7.2%
i                  1287   5.7%
s                  1281   5.7%
n                  1273   5.7%
M                  1105   4.9%
l                  1039   4.6%
o                  986    4.4%
Other values (48)  7655   34.1%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  22461  100.0%

Most frequent character per category

(unknown)
Value              Count  Frequency (%)
(space)            2649   11.8%
r                  1914   8.5%
e                  1654   7.4%
a                  1618   7.2%
i                  1287   5.7%
s                  1281   5.7%
n                  1273   5.7%
M                  1105   4.9%
l                  1039   4.6%
o                  986    4.4%
Other values (48)  7655   34.1%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  22461  100.0%

Most frequent character per script

(unknown)
Value              Count  Frequency (%)
(space)            2649   11.8%
r                  1914   8.5%
e                  1654   7.4%
a                  1618   7.2%
i                  1287   5.7%
s                  1281   5.7%
n                  1273   5.7%
M                  1105   4.9%
l                  1039   4.6%
o                  986    4.4%
Other values (48)  7655   34.1%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  22461  100.0%

Most frequent character per block

(unknown)
Value              Count  Frequency (%)
(space)            2649   11.8%
r                  1914   8.5%
e                  1654   7.4%
a                  1618   7.2%
i                  1287   5.7%
s                  1281   5.7%
n                  1273   5.7%
M                  1105   4.9%
l                  1039   4.6%
o                  986    4.4%
Other values (48)  7655   34.1%

Sex
Categorical

HIGH CORRELATION 

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 53.6 KiB

Value   Count
male    573
female  314

Length

Max length: 6
Median length: 4
Mean length: 4.7080045
Min length: 4

Characters and Unicode

Total characters: 4176
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: male
2nd row: female
3rd row: female
4th row: female
5th row: male

Common Values

Value   Count  Frequency (%)
male    573    64.6%
female  314    35.4%

Length

[Figure: Histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of the common values]

Most occurring characters

Value  Count  Frequency (%)
e      1201   28.8%
m      887    21.2%
a      887    21.2%
l      887    21.2%
f      314    7.5%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  4176   100.0%

Most frequent character per category

(unknown)
Value  Count  Frequency (%)
e      1201   28.8%
m      887    21.2%
a      887    21.2%
l      887    21.2%
f      314    7.5%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  4176   100.0%

Most frequent character per script

(unknown)
Value  Count  Frequency (%)
e      1201   28.8%
m      887    21.2%
a      887    21.2%
l      887    21.2%
f      314    7.5%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  4176   100.0%

Most frequent character per block

(unknown)
Value  Count  Frequency (%)
e      1201   28.8%
m      887    21.2%
a      887    21.2%
l      887    21.2%
f      314    7.5%
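
The character statistics for Sex can be cross-checked against its two value counts, since every row is either "male" or "female". A minimal sketch of that check (the variable names are illustrative):

```python
from collections import Counter

# Value counts for Sex as reported under Common Values.
value_counts = {"male": 573, "female": 314}

# Each character of a value occurs once per row holding that value.
char_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        char_counts[ch] += count

total_chars = sum(char_counts.values())
mean_length = total_chars / sum(value_counts.values())

print(dict(char_counts))
print(total_chars, round(mean_length, 7))
```

The letter e appears once in "male" and twice in "female", which gives 573 + 2 * 314 = 1201, matching the table above.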

Age
Real number (ℝ)

Distinct: 89
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 29.471443
Minimum: 0.42
Maximum: 80
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0.42
5-th percentile: 5
Q1: 20.25
median: 28
Q3: 38
95-th percentile: 55.85
Maximum: 80
Range: 79.58
Interquartile range (IQR): 17.75
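
Range and IQR are derived from the other quantile statistics: the range is maximum minus minimum, and the IQR is Q3 minus Q1. A quick check with the Age figures above:

```python
# Quantile statistics for Age, copied from the report.
minimum, q1, q3, maximum = 0.42, 20.25, 38.0, 80.0

# Interquartile range: spread of the middle 50% of the values.
iqr = q3 - q1

# Range: spread between the smallest and largest value.
value_range = maximum - minimum

print(f"IQR = {iqr:.2f}, range = {value_range:.2f}")
```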

Descriptive statistics

Standard deviation: 14.121908
Coefficient of variation (CV): 0.47917261
Kurtosis: 0.29255909
Mean: 29.471443
Median Absolute Deviation (MAD): 8
Skewness: 0.44718857
Sum: 26141.17
Variance: 199.4283
Monotonicity: Not monotonic
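
Several of the descriptive statistics for Age are linked by simple identities: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the rounded figures in the report, so tiny rounding differences are expected.

```python
# Figures for Age copied from the report (already rounded there).
n_observations = 887
total = 26141.17          # Sum
std_dev = 14.121908       # Standard deviation

mean = total / n_observations   # Mean = Sum / n
cv = std_dev / mean             # Coefficient of variation = std / mean
variance = std_dev ** 2         # Variance = std squared

print(f"mean={mean:.6f}  cv={cv:.8f}  variance={variance:.4f}")
```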

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value              Count  Frequency (%)
22                 39     4.4%
28                 37     4.2%
18                 36     4.1%
24                 34     3.8%
21                 34     3.8%
30                 33     3.7%
19                 33     3.7%
27                 26     2.9%
23                 25     2.8%
29                 25     2.8%
Other values (79)  565    63.7%

Minimum Values

Value  Count  Frequency (%)
0.42   1      0.1%
0.67   1      0.1%
0.75   2      0.2%
0.83   2      0.2%
0.92   1      0.1%
1      7      0.8%
2      11     1.2%
3      7      0.8%
4      11     1.2%
5      6      0.7%

Maximum Values

Value  Count  Frequency (%)
80     1      0.1%
74     1      0.1%
71     2      0.2%
70.5   1      0.1%
70     2      0.2%
69     1      0.1%
66     2      0.2%
65     3      0.3%
64     3      0.3%
63     2      0.2%

Siblings/Spouses Aboard
Real number (ℝ)

ZEROS 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.5253664
Minimum: 0
Maximum: 8
Zeros: 604
Zeros (%): 68.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 3
Maximum: 8
Range: 8
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.1046686
Coefficient of variation (CV): 2.1026631
Kurtosis: 17.797537
Mean: 0.5253664
Median Absolute Deviation (MAD): 0
Skewness: 3.6867598
Sum: 466
Variance: 1.2202926
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=7)]

Common Values

Value  Count  Frequency (%)
0      604    68.1%
1      209    23.6%
2      28     3.2%
4      18     2.0%
3      16     1.8%
8      7      0.8%
5      5      0.6%

Minimum Values

Value  Count  Frequency (%)
0      604    68.1%
1      209    23.6%
2      28     3.2%
3      16     1.8%
4      18     2.0%
5      5      0.6%
8      7      0.8%

Maximum Values

Value  Count  Frequency (%)
8      7      0.8%
5      5      0.6%
4      18     2.0%
3      16     1.8%
2      28     3.2%
1      209    23.6%
0      604    68.1%

Parents/Children Aboard
Real number (ℝ)

ZEROS 

Distinct: 7
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.38331454
Minimum: 0
Maximum: 6
Zeros: 674
Zeros (%): 76.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.80746591
Coefficient of variation (CV): 2.1065361
Kurtosis: 9.7230659
Mean: 0.38331454
Median Absolute Deviation (MAD): 0
Skewness: 2.7411981
Sum: 340
Variance: 0.65200119
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=7)]

Common Values

Value  Count  Frequency (%)
0      674    76.0%
1      118    13.3%
2      80     9.0%
5      5      0.6%
3      5      0.6%
4      4      0.5%
6      1      0.1%

Minimum Values

Value  Count  Frequency (%)
0      674    76.0%
1      118    13.3%
2      80     9.0%
3      5      0.6%
4      4      0.5%
5      5      0.6%
6      1      0.1%

Maximum Values

Value  Count  Frequency (%)
6      1      0.1%
5      5      0.6%
4      4      0.5%
3      5      0.6%
2      80     9.0%
1      118    13.3%
0      674    76.0%

Fare
Real number (ℝ)

ZEROS 

Distinct: 248
Distinct (%): 28.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 32.30542
Minimum: 0
Maximum: 512.3292
Zeros: 15
Zeros (%): 1.7%
Negative: 0
Negative (%): 0.0%
Memory size: 7.1 KiB

Quantile statistics

Minimum: 0
5-th percentile: 7.225
Q1: 7.925
median: 14.4542
Q3: 31.1375
95-th percentile: 112.55749
Maximum: 512.3292
Range: 512.3292
Interquartile range (IQR): 23.2125

Descriptive statistics

Standard deviation: 49.78204
Coefficient of variation (CV): 1.5409811
Kurtosis: 33.264605
Mean: 32.30542
Median Absolute Deviation (MAD): 6.9584
Skewness: 4.7776714
Sum: 28654.908
Variance: 2478.2515
Monotonicity: Not monotonic

[Figure: Histogram with fixed size bins (bins=50)]

Common Values

Value               Count  Frequency (%)
8.05                43     4.8%
13                  42     4.7%
7.8958              36     4.1%
7.75                33     3.7%
26                  31     3.5%
10.5                24     2.7%
7.925               18     2.0%
7.775               16     1.8%
7.2292              15     1.7%
26.55               15     1.7%
Other values (238)  614    69.2%

Minimum Values

Value   Count  Frequency (%)
0       15     1.7%
4.0125  1      0.1%
5       1      0.1%
6.2375  1      0.1%
6.4375  1      0.1%
6.45    1      0.1%
6.4958  2      0.2%
6.75    2      0.2%
6.8583  1      0.1%
6.95    1      0.1%

Maximum Values

Value     Count  Frequency (%)
512.3292  3      0.3%
263       4      0.5%
262.375   2      0.2%
247.5208  2      0.2%
227.525   4      0.5%
221.7792  1      0.1%
211.5     1      0.1%
211.3375  3      0.3%
164.8667  2      0.2%
153.4625  3      0.3%

Interactions

[Figures: 16 interaction plots]

Correlations

[Figure: correlation heatmap]

                         Age     Fare    Parents/Children Aboard  Pclass  Sex     Siblings/Spouses Aboard  Survived
Age                      1.000   0.156   -0.254                   0.289   0.119   -0.199                   0.141
Fare                     0.156   1.000   0.409                    0.479   0.187   0.446                    0.282
Parents/Children Aboard  -0.254  0.409   1.000                    0.020   0.246   0.449                    0.155
Pclass                   0.289   0.479   0.020                    1.000   0.127   0.147                    0.335
Sex                      0.119   0.187   0.246                    0.127   1.000   0.204                    0.539
Siblings/Spouses Aboard  -0.199  0.446   0.449                    0.147   0.204   1.000                    0.186
Survived                 0.141   0.282   0.155                    0.335   0.539   0.186                    1.000
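
The only high-correlation alert in this report is for the pair Sex and Survived, whose coefficient of 0.539 is the largest off-diagonal value in the matrix. The sketch below scans the matrix for pairs at or above a cutoff. The cutoff of 0.5 is an assumption chosen for illustration; the report itself does not state which threshold the profiler applied.

```python
labels = [
    "Age", "Fare", "Parents/Children Aboard", "Pclass",
    "Sex", "Siblings/Spouses Aboard", "Survived",
]

# Correlation matrix copied from the report, rows and columns in `labels` order.
matrix = [
    [1.000, 0.156, -0.254, 0.289, 0.119, -0.199, 0.141],
    [0.156, 1.000, 0.409, 0.479, 0.187, 0.446, 0.282],
    [-0.254, 0.409, 1.000, 0.020, 0.246, 0.449, 0.155],
    [0.289, 0.479, 0.020, 1.000, 0.127, 0.147, 0.335],
    [0.119, 0.187, 0.246, 0.127, 1.000, 0.204, 0.539],
    [-0.199, 0.446, 0.449, 0.147, 0.204, 1.000, 0.186],
    [0.141, 0.282, 0.155, 0.335, 0.539, 0.186, 1.000],
]

THRESHOLD = 0.5  # assumed cutoff, not stated in the report

# Visit each unordered pair once (upper triangle, diagonal excluded).
high_pairs = [
    (labels[i], labels[j], matrix[i][j])
    for i in range(len(labels))
    for j in range(i + 1, len(labels))
    if abs(matrix[i][j]) >= THRESHOLD
]

print(high_pairs)
```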

Missing values

[Figure: A simple visualization of nullity by column.]
[Figure: Nullity matrix, a data-dense display which lets you quickly visually pick out patterns in data completion.]

Sample

First rows

     Survived  Pclass  Name                                                Sex     Age   Siblings/Spouses Aboard  Parents/Children Aboard  Fare
0    0         3       Mr. Owen Harris Braund                              male    22.0  1                        0                        7.2500
1    1         1       Mrs. John Bradley (Florence Briggs Thayer) Cumings  female  38.0  1                        0                        71.2833
2    1         3       Miss. Laina Heikkinen                               female  26.0  0                        0                        7.9250
3    1         1       Mrs. Jacques Heath (Lily May Peel) Futrelle         female  35.0  1                        0                        53.1000
4    0         3       Mr. William Henry Allen                             male    35.0  0                        0                        8.0500
5    0         3       Mr. James Moran                                     male    27.0  0                        0                        8.4583
6    0         1       Mr. Timothy J McCarthy                              male    54.0  0                        0                        51.8625
7    0         3       Master. Gosta Leonard Palsson                       male    2.0   3                        1                        21.0750
8    1         3       Mrs. Oscar W (Elisabeth Vilhelmina Berg) Johnson    female  27.0  0                        2                        11.1333
9    1         2       Mrs. Nicholas (Adele Achem) Nasser                  female  14.0  1                        0                        30.0708

Last rows

     Survived  Pclass  Name                                                Sex     Age   Siblings/Spouses Aboard  Parents/Children Aboard  Fare
877  0         3       Mr. Johann Markun                                   male    33.0  0                        0                        7.8958
878  0         3       Miss. Gerda Ulrika Dahlberg                         female  22.0  0                        0                        10.5167
879  0         2       Mr. Frederick James Banfield                        male    28.0  0                        0                        10.5000
880  0         3       Mr. Henry Jr Sutehall                               male    25.0  0                        0                        7.0500
881  0         3       Mrs. William (Margaret Norton) Rice                 female  39.0  0                        5                        29.1250
882  0         2       Rev. Juozas Montvila                                male    27.0  0                        0                        13.0000
883  1         1       Miss. Margaret Edith Graham                         female  19.0  0                        0                        30.0000
884  0         3       Miss. Catherine Helen Johnston                      female  7.0   1                        2                        23.4500
885  1         1       Mr. Karl Howell Behr                                male    26.0  0                        0                        30.0000
886  0         3       Mr. Patrick Dooley                                  male    32.0  0                        0                        7.7500